By Julien Hernandez Lallement, 2020-06-03, modified 2020-06-03, in category Opinion
I worked for 10 years in academic environments. During that time, I moved across labs located in Italy, France (2 of them), Germany (2 of them) and the Netherlands. I published reasonably well, and got a Marie Curie grant for my PostDoc, a fairly prestigious source of funding that would have made it easier for me to stay in academia.
At some point, however, I grew tired of the academic environment: a mixture of working with animals (ethically debatable), long hours, a mediocre salary, contract instability, and the feeling of contributing to something not quite helpful to "make the world a better place"...
In research, my main focus was neuroscience, but overall, I got my head deep enough into data of all kinds to see the beauty of it. One logical step for me was to move into the field of data science, in the private sector. My long-term goal is to get a position in a company where I can do some work to help others and the planet... That felt like a big jump into cold water, as the Germans say. In retrospect, it was easier than I expected, but I admit that I prepared a lot for that change.
I am writing this article for other PhDs and PostDocs out there who might feel like a change. Maybe telling you how I did it (you'll find many of these stories on the web) could give you some tips or inspiration. In any case, I feel like writing it down while it's still fresh. I will discuss some aspects that I found important when moving from academia to the private sector, and reflect on others that ended up not being so important.
One thing should be clear: I was not a data guy. Hell, I sometimes doubt I am now! In fact, I started out knowing nothing about it. Most of what I knew when I started my master's internship was about means and standard deviations, and even the latter concept was not easy for me to grasp. Forget about parametric statistical testing, ANOVA and whatnot. I was not much better when I started my actual PhD... The point is: if you relate, don't worry. That stuff is complicated. Everyone struggles with it. Learning by doing is how I got some grasp of data concepts. Of course you can read, prepare, and take online courses. All of that helps. But not understanding something yet is no reason not to play with it. At least that's how I see it :)
Don't worry if you feel like you don't have enough knowledge about data. There is too much out there, and the most important thing is to start doing. Everyone suffers from impostor syndrome ;)
Now, if you are reading this, I guess you are an academic. Then you have probably learned to collect and extract data, visualize and explore it, and do some basic or maybe more complex statistical testing, probably frequentist. That's all good, and it can work beautifully. What I learned when I started working in the private sector, though, was that all this careful preparation of data, running a power analysis to pick the right sample size, making sure all assumptions are fulfilled when comparing distributions... is not so relevant. Don't get me wrong, you should still do it! But no one will check whether what you are saying is sound. There is no peer review. It is a very results-oriented environment, due of course to the faster rhythm inherent to most businesses. This means that in these so-called "real world" situations, you will often have to find a balance between rigor in your work and meeting deadlines or showing your work to others. This can be fundamentally different from research, where we are quite perfectionistic and aim at understanding processes as well as we can before publishing.
In the private sector, I found that a cumulative, incremental process works much better. I aim at providing deliverables: not yet a finished product (data pipelines), but some kind of prototype that demonstrates the value of the project. Added functionality and complexity come with time. That means that it is okay to have models or data products that are rather approximate at first, as long as 1) you are honest about the limitations and caveats and 2) you keep increasing the accuracy and usefulness of your product. In other words, it's better to start with a low-level product, a primitive form of what you have in mind, that can already produce some added value. You can improve it by testing, and by integrating feedback from others.
That goes along with the Agile framework, which advocates a more flexible workflow carried out by self-managing teams, where expertise is distributed rather than concentrated in particular individuals. That brings me to my next point.
Get used to showing your work, even at a very immature stage, and think about explaining how continuing it will benefit your projects and goals in the long run. If possible, produce a prototype for others to use and experiment with (for example, an early version of an ML model whose predictions are a bit off but already better than heuristics). Always be honest about the limitations (in our example, mention that some data features are missing).
Now, back when I was a PhD student, the term "Project Management" activated a mental representation of a guy in a suit, probably wearing sunglasses and carrying a briefcase. Too much TV... With that in mind, I thought that succeeding at the transition out of academia meant gaining some project management skills. I was right, but not for the reasons I thought, and here is why: in my opinion, rather than a job, project management is an attitude, a tool, a mindset that one can bring to one's work. There are some interesting pieces, like this one, suggesting that project management might not always be a job per se, but a versatile skillset that can be combined with more specialized work. For example, McKevitt et al. conclude that a project manager is a "jack-of-all-trades". Sure, in some cases, project management is a career of its own! I am not trying to dismiss project managers as a whole. What I mean is that you can do quite a lot of the management of your own projects, and if people know some tools and are told to take the time for it, you might need fewer project managers in your organization.
In research... I never heard about that! I managed my projects by writing down some to-do lists on pieces of paper, and documented almost nothing (except for some code, and even then, poorly). That changed during my PostDoc, which is when I got familiar with Agile and Waterfall methodologies by attending project management courses to help me structure my work. I did not do this with the aim of leaving academia, but rather to add a new tool to my skillset. I found out later that these skills are very useful across all work domains.
I first attended a course and got certified as a Scrum Master. I won't go into too much detail, but that is connected to the Agile framework I mentioned earlier. Scrum is a framework that allows teams to work dynamically and incrementally. It is very well suited to some businesses, in particular IT, but not only. That course was very hard at first... many words were unfamiliar to me, and most attendees were IT people who had no idea what it meant to be a researcher like me.
During the course, I heard about "waterfall project management", and mostly heard it was bad. Because I am quite curious, I decided to read about it and found out that this method is still the most used, but things are changing fast. Basically, Agile is the new kid on the block, and it has shown that a flexible methodology is better suited to some domains. Still, I decided to get some training. I attended a course, and ultimately passed the exam for the PMP (Project Management Professional) certification of the Project Management Institute (PMI). And that was great! I do agree with some criticisms of this system and of its inflexibility. However, learning about waterfall project management really gets you started on the basic concepts of which activities you should carefully manage, such as costs, schedule and communication. It will give you tools and ideas on how to do so, and overall, I found it very inspiring. I still work in a rather Agile way, but the way I document and manage projects has a very waterfall-y smell to it.
Learn some project management. You don't need to become a PMP, which takes quite some (boring) studying. But inform yourself, and try to structure and document your project in such a way that a newcomer would only have to go through your notes to understand what has been done.
I don't know about you, but back when I was in research, I was pretty much my own boss. I had a few projects that fell under a fairly flexible grant protocol. I managed my projects and worked hard to get data and keep things moving forward. I was used to getting things done fast, and when problems arose, I would find a relatively quick fix, a week tops, often less. Well, things are different in the private sector. Now again, I am working in a relatively big company, and things might be different in a startup or a smaller company. But larger structures have a higher number of employees, each of whom brings some idiosyncrasies with them. People have different schedules, constraints, and sometimes agendas. You might find yourself in need of some data that should be quite easy to get, but end up waiting some weeks to get access to it. Things go slower than in academia, although the day-to-day work seems much faster, due to the high project volatility. By the same token, you might be involved in many more projects than in academic environments, so all in all, you will still be quite busy, if not more so ;)
My first position outside academia was in a company going through a digital transition, where data-driven decisions were mostly made from Excel graphs. Nothing wrong with that; Excel can do wonderful things in the right hands. However, modern data-related work follows some standards: documentation, version control, etc. All these things can be quite new for colleagues less versed in data than you are. Don't be scared of showing what you can do, and the advantages of new systems and software. By the same token, don't be scared of changing the way you work. Git can be quite daunting at first, but it is a must if you work with complex data products, which you most certainly will as a data scientist/engineer. Similarly, many of us data scientists look down on Excel, forgetting that simplicity and interpretability have quite some value as well...
Bring your skillset with you, and show how you worked in the past. Some tools might not be adapted to this new environment, but others might really be a game changer for some people. Similarly, be open to new practices and feedback ;)
That's it for now. I will probably post another article once some time has passed, or edit this one with new ideas and experiences. Feel free to hit me up if you have any questions, always glad to help!